We investigate further several properties of the minority
game we have recently introduced. We explain the origin of
the phase transition and give an analytical expression of
$\sigma^2/N$ in the $N \ll 2^M$ region. The ability of the
players to learn a given payoff is also analyzed, and we show
that the Darwinian evolution process tends to a self-organized
state; in particular, the life-time distribution is a power law
with exponent -2. Furthermore, we study the influence of
identical players on their gain and on the system's performance.
Finally, we show that large brains always take advantage of
small brains.

Recently we have studied a simple model of a minority
game [1] that captures the essential features of the "bar
problem" of Arthur ([?] and [?]). $N$ players compete with
each other and act by induction and adaptation. At each time
step they must choose one of two sides, and those who happen
to be on the minority side win. They receive a reward when
making the right choice, keep in memory the $M$ last winning
sides, and use this knowledge to act at the next time step.
Each player possesses a finite set of $S$ strategies and uses
the one which would have been the most rewarding had it been
used since the beginning. A strategy is a behavior rule that
stipulates an action for every possible piece of information
(see [1] or [?] for more details). Recently, Savit, Manuca and
Riolo [?] have studied the step payoff case where each player
has the same memory $M$ and only two strategies. They have
found a phase transition with parameter $\rho = 2^M/N$.
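
The rules just described can be illustrated by a minimal simulation
sketch of the step payoff case. The function names, the random
tie-breaking between equally scored strategies, the random initial
history, and the choice of measuring the fluctuations about $N/2$ are
assumptions of this sketch and are not taken from [1].

```python
import random


def simulate_minority_game(n_players, memory, n_strategies, n_steps, seed=0):
    """Return the attendance at side 1 for each time step (n_players odd)."""
    rng = random.Random(seed)
    n_hist = 2 ** memory
    # strategies[p][k][h]: side (0 or 1) prescribed by strategy k of
    # player p when the last `memory` winning sides encode history h.
    strategies = [[[rng.randrange(2) for _ in range(n_hist)]
                   for _ in range(n_strategies)]
                  for _ in range(n_players)]
    virtual = [[0] * n_strategies for _ in range(n_players)]
    history = rng.randrange(n_hist)  # assumption: random initial history
    attendance = []
    for _ in range(n_steps):
        choices = []
        for p in range(n_players):
            best = max(virtual[p])
            # assumption: ties between strategies are broken at random
            tied = [k for k in range(n_strategies) if virtual[p][k] == best]
            choices.append(strategies[p][rng.choice(tied)][history])
        n_one = sum(choices)
        attendance.append(n_one)
        # the minority side wins (no tie is possible when n_players is odd)
        winner = 1 if n_one < n_players - n_one else 0
        for p in range(n_players):
            for k in range(n_strategies):
                if strategies[p][k][history] == winner:
                    virtual[p][k] += 1  # step payoff: one point per good choice
        history = ((history << 1) | winner) % n_hist
    return attendance


def variance_per_player(attendance, n_players):
    """Mean squared deviation of the attendance from N/2, divided by N."""
    mean_sq = sum((a - n_players / 2) ** 2 for a in attendance) / len(attendance)
    return mean_sq / n_players
```

Calling `variance_per_player(simulate_minority_game(101, 6, 2, 5000), 101)`
gives one estimate of $\sigma^2/N$ for a single realization; averaging over
several seeds would be needed for a smooth curve.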

Here, we continue the study of payoff learning and of the
Darwinian process initiated in [1] and introduce other
interesting aspects of the model. We also give a geometrical
explanation of the phase transition and find an analytical
expression of $\sigma^2/N$ in the $N \ll 2^M$ region.



I. LEARNING OF DIFFERENT PAYOFFS 

The simplest payoff consists in giving a point for every good
choice, i.e. the individual payoff function is equal to one and
the global payoff is equal to $w$, the number of winners. It fits
the minority game, but in most other real situations the winners
must share limited resources. For example, consider the
global payoff $F_{a,b}(w) = aw + b$. By varying $a$ and $b$, we
can reproduce some real situations; for instance, the lottery game
corresponds to $a = 0$, $b > 0$, and the so-called step payoff
function is obtained with $a > 0$ and $b = 0$. In general, a
player wishes to maximize his profit, i.e. a priori the individual
payoff function $f(w)$. It is tempting to aim for the maximum
of $f(w)$ at each time step, but this occurs on average only every
$N/\langle w\rangle$ time steps; thus the system is expected to try
to reach the average number of winners that maximizes the global
payoff function $F(w) = f(w)w$, say $w_{max}$. The $w_{max} = 1$
case reveals one limit of the model due to its binary nature. Each
player would like to win alone, but he has to face reality: the
minimum number of winners is on average $N/2^S$, namely the
players all of whose strategies give the same answer to the
current information. If $S = 2$ and $w_{max} = N/2$ this acts as a
safeguard, but it is harmful when $N/2^S > w_{max}$. If
$w_{max} \sim N/2^S$, $N/2^S$ players sometimes win huge rewards,
but it only happens intermittently and always after the same
information, because the other players are blinded by the enormous
virtual gain; thus there are only two groups of roughly $N/2^S$
people that can hit the jackpot. The moral is the following: one
can win a lot, but not intentionally. If the global payoff has a
maximum at $w_{max}$ sufficiently greater than $N/2^S$, the system
can manage to maximize the global payoff. For instance, we
take $f(w) = w(N/2 - w)$, that is, $F(w) = w^2(N/2 - w)$, which has
a maximum at $N/3$. If $N = 101$, $M = 6$ and $S = 15$, it is clear
that the peaks are at $w = N/3$ (see figure 1).
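
The position of the maximum follows from $dF/dw = Nw - 3w^2 = 0$, i.e.
$w = N/3$. As a quick numerical check, the sketch below searches the
integer number of winners that maximizes $F(w)$; the function names are
illustrative only, and the search range $0 \le w \le (N-1)/2$ assumes an
odd $N$, so that the winners are always the strict minority.

```python
def global_payoff(w, n_players):
    """F(w) = w^2 (N/2 - w), the global payoff used in the example."""
    return w * w * (n_players / 2 - w)


def argmax_global_payoff(n_players):
    """Integer number of winners maximizing F(w) over 0 <= w <= N // 2."""
    candidates = range(0, n_players // 2 + 1)
    return max(candidates, key=lambda w: global_payoff(w, n_players))
```

For $N = 101$ the integer maximizer is 34, the integer closest to
$N/3 \approx 33.7$.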

Nevertheless there are conditions under which the system
can get two peaks. In general, if $\rho > \rho_c$ the histogram of
the attendance at side A can only have one peak, centered
on $N/2$. If $\rho < \rho_c$ the histogram has at least two peaks
that are symmetrical with respect to $N/2$ and whose positions
strongly depend on $\rho$, and, if $w_{max} = N/2$, a peak centered
on $N/2$. Consequently, if $w_{max} \ll N$, a system can maximize
the global gain only if $N > 2^M$ and $N < 2^S$.

FIG. 1. Histogram of the attendance $n_A$ at side A
($N = 101$, $M = 6$, $S = 15$). The system has maximized
the global payoff $w^2(N/2 - w)$ by getting the typical number
of winners equal to $N/3$.



II. DARWINISM 

The Darwinian process is the same as in [1]: every $r$ time
steps, the worst player is replaced by a clone of the best,
except that one strategy is redrawn with a small probability
$p$ in order to allow regeneration, and that the virtual gains
of the strategies are reset to zero, like a newborn baby. If
the new player is a pure clone of the best one, one says that
both belong to the same species. A demonstration of the
benefits of Darwinism can be seen in figure 2, which compares
the variance of the attendance signal with and without
Darwinism. The region where $\sigma^2/N < 1/4$ is much larger
when evolution takes place; in particular the $\rho < \rho_c$
region is less affected by overcrowding. Note that increasing
$p$ lowers $\sigma^2/N$, that is, the mutations are useful.
The asymptotic behavior of $\sigma^2/N$ in the $\rho \to \infty$
limit depends on $p$, but Darwinism is harmful in this region.
One also sees that the minimum of $\sigma^2/N$ is not at the same
$\rho$ as without Darwinism. One can wonder why the coordination
is better in the crowded phase although there are multiple clones
of a lot of players. Savit et al. [?] pointed out the existence of
a harmful process of period 2; more precisely, the first time
a given piece of information is given to the system,
$N/2 + O(\sqrt{N})$ players choose one given side; the next time
the same information occurs, $N/2 + O(N)$ players go to the
opposite side. When evolution takes place, the new player loses
the virtual gains of the strategies he gets, so the harmful
coordination vanishes and the periodic process fades away.
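
One replacement event of the process described above can be sketched as
follows. The data layout (nested lists of 0/1 strategies), the function
name and the argument names are assumptions of this sketch; how the
score of the newcomer is initialized is not specified in the text and
is deliberately left to the caller.

```python
import random


def darwin_step(strategies, virtual_gains, scores, mutation_prob, rng):
    """Replace the worst player by a clone of the best; return its index.

    strategies[p][k] is a list of 0/1 actions, virtual_gains[p][k] the
    virtual gain of that strategy, scores[p] the player's score (only read).
    """
    n = len(scores)
    worst = min(range(n), key=lambda p: scores[p])
    best = max(range(n), key=lambda p: scores[p])
    if worst == best:  # all scores equal: nothing to replace
        return worst
    clone = [list(s) for s in strategies[best]]  # deep copy of the best player
    if rng.random() < mutation_prob:
        # with probability p, one strategy of the clone is redrawn at random
        k = rng.randrange(len(clone))
        clone[k] = [rng.randrange(2) for _ in clone[k]]
    strategies[worst] = clone
    # the newcomer starts with all virtual gains reset to zero
    virtual_gains[worst] = [0] * len(clone)
    return worst
```

In a full simulation this function would be called every $r$ time steps,
with `rng = random.Random(seed)` shared with the rest of the game.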



found in real life evolution [?]. Figure 4 helps to understand
what happens. The average gain of several players during
the game is plotted. One can see players remaining in the
game, and some others resisting for a while, then disappearing.
The player that replaces a dead one is followed. This figure
shows that the fluctuations of the average gain are very high
when a player is young. Consequently such a player's death
or reproduction is more likely than that of an old player;
this leads to punctuated equilibrium, explaining the origin of
the power-law distributions.

FIG. 2. Dependence of $\sigma^2/N$ on $\rho$ with Darwinism for
$p = 0.01$ (stars) and $p = 0.1$ (squares) ($M = 5$, $S = 2$,
$r = 10$). The dashed line represents the random performance
and the dotted line is $\sigma^2/N$ without Darwinism.

Since $p < 1$, the diversity (the number of different species)
is reduced by the evolutionary process, and tends to a value
that depends on $p$. Indeed, if $p = 1$, the diversity remains
equal to $N$; when $p$ decreases, the diversity decreases too,
but stays above $N/2$ even if $p = 0$. Moreover, when one starts
with only one species, but with random virtual gains, the
system first performs very badly, slowly improves, and
reaches an optimal diversity that is always greater than $N/2$.
When the diversity is stable, the system is in a stationary state,
and one can study several distributions. First, figure 3 shows
the distribution of the species life time. It is a power law
with exponent $-2.02 \pm 0.02$, which does not depend on the
parameters ($r$, $p$, ...). Fortunately, it is the same exponent as

FIG. 3. Distribution of species life time ($N = 101$, $M = 8$,
$S = 2$, NIT = 500000, $r = 10$). The straight line has a $-2$
exponent. The right part of the distribution does not lie on the
same line, because of fluctuations.

FIG. 4. Temporal evolution of the average gain of several
players ($N = 51$, $M = 10$, $S = 2$, $r = 10$). Note the
consequences of the death of a player at $t = 1450$ and $t = 2100$.

After sufficient time, it is interesting to plot the number
of members composing each species against their rank (see
figure 5). One finds a power law that depends at least on $p$;
indeed, if $p = 1$, the best performer will never be completely
cloned and one obtains a flat line: every type of player has
only one member.

FIG. 5. Rank of the members in a type of player
($N = 20001$, $S = 5$, $M = 5$, $r = 10$). It is a power law,
but the exponent depends on the system's parameters.



below; ii) $\rho \sim \rho_c$: the fluctuations are minimal and the
system performs better than by coin-tossing; iii) $\rho \gg \rho_c$:
$\sigma^2/N$ tends to 1/4, the performance of the random case,
because there are no longer enough players to coordinate.

One sees that increasing the number of strategies reduces
the $\rho \sim \rho_c$ region, and that $\sigma^2/N$ depends only on
$\rho$ if $\rho$ is large. Furthermore, $\rho_c$ is roughly linear in
$S$, as one can see in the following table (values are only
approximate):

In order to know whether the proposed Darwinian process is the
right one, we have tried to apply the inverse process: a clone
of the worst player replaces the best one. Figure 6 shows
that the global performance suffers a lot.

FIG. 6. Anti-Darwinism: a clone of the worst player replaces
the best. The attendance at side A is plotted with dots.



III. EXPLANATION OF THE PHASE 
TRANSITION 

Let us now discuss in detail figure 7, where the phase
transition clearly appears. Considering the step payoff function,
we have plotted $\sigma^2/N$ against $\rho = 2^M/N$ for
$S = 2, 3, 4, 5$; the dashed line $\sigma^2/N = 1/4$ gives the
performance that would be attained if every player chose its side
randomly. Let us call $\rho_c$ the abscissa where $\sigma^2/N$ is
minimal. Three regions appear: i) $\rho \ll \rho_c$: in this region,
$\sigma^2/N \sim 1/\rho$, that is, $\sigma^2/N^2 \sim 1/2^M$; keeping
$M$ constant and increasing $N$ produces only a dilatation of the
system, whose origin is discussed

FIG. 7. Dependence of $\sigma^2/N$ on $\rho$ for $S = 2, 3, 4, 5$
and 6 (circles, triangles, stars, x and diamonds) ($M = 8$). The
continuous line is the plot of $1/x$, and the dashed one represents
the performance of the random case.

This figure leads to two questions. First, since a player
completely ignores the details of the other players' strategies,
how is it possible that the system performs better than by
coin tossing? Second, why is $\rho = 2^M/N$ the right parameter,
that is, what is the meaning of $\rho$? Indeed we know that there
are $2^{2^M}$ strategies. One would have expected the phase
transition to occur when $N \sim 2^{2^M}$. Why is it not the
case?

First it is clear that the only way the players can interact
is through the virtual values of their strategies. The
coordination occurs because there is some information in these
values. The nature of this information is the following. Since a
strategy consists of $2^M$ bits, it belongs to a hypercube of
dimension $2^M$, $H_M$. For all $s \in H_M$, $s(i) \in \{0, 1\}$,
$i = 1, \ldots, 2^M$ are the components of $s$. We take the
distance on this hypercube, also called Hamming distance, which
counts the number of different bits: let $s$ and $t \in H_M$, then

$$D_M(s,t) = \sum_{i=1}^{2^M} |s(i) - t(i)|.$$

For convenience, we introduce the normalized distance
$d_M(s,t) = D_M(s,t)/2^M$. Now, given a piece of information, the
probability that these two strategies react in the same way
is $1 - d_M(s,t)$. Therefore, using the strategy with
the highest virtual value is equivalent to choosing the one that
has the smallest probability of reacting in the same way as
the other players. It is clear now that the virtual value of a
strategy is related to its average distance from all strategies
used by all other players. Of course each player's dream is to
have at each time step at least one strategy that has been on
average further than 1/2 from all other players' used strategies.
Let us define $\langle d\rangle$, the average actual distance
between the players, by computing at each time step the average
distance between the strategies currently being used by the
players, and by averaging this quantity over the whole history of
the game. One sees in figure 8 that $\langle d\rangle$ is actually
greater than 1/2 as soon as $\sigma^2/N < 1/4$. Moreover, there
is a linear relationship between $\langle d\rangle$ and
$\sigma^2/N$, as shown in figure 9.
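
The distances defined above are straightforward to compute. The sketch
below represents a strategy as a sequence of $2^M$ bits; the function
names are illustrative, and `mean_pairwise_distance` only gives the
average over pairs of players at one time step, so that obtaining
$\langle d\rangle$ would still require averaging it over the history of
the game.

```python
from itertools import combinations


def hamming_distance(s, t):
    """D_M(s, t): number of components on which s and t differ."""
    return sum(a != b for a, b in zip(s, t))


def normalized_distance(s, t):
    """d_M(s, t) = D_M(s, t) / 2^M; 1 - d_M is the probability that the
    two strategies react the same way to a uniformly drawn information."""
    return hamming_distance(s, t) / len(s)


def mean_pairwise_distance(used_strategies):
    """Average normalized distance over all pairs of strategies in use."""
    pairs = list(combinations(used_strategies, 2))
    return sum(normalized_distance(s, t) for s, t in pairs) / len(pairs)
```

For example, with $M = 2$, `normalized_distance((0, 0, 0, 0), (0, 1, 0, 1))`
is 0.5 (uncorrelated strategies), whereas a strategy and its bitwise
inverse are at distance 1.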
FIG. 8. Log-log plot of the average actual distance
$\langle d\rangle$ against $\rho$ ($N = 51$, $M = 2, \ldots, 11$,
$S = 2$).

FIG. 9. Linear relationship between $\sigma^2/N$ and
$\langle d\rangle$. The continuous line shows a linear fit.

There are $2^{2^M}$ different strategies. But take a strategy and
change only one bit of it. The distance between these two
strategies is $1/2^M$, that is, they are not really different. The
question is now: given a strategy $s$, how many strategies are
significantly different from $s$? Of course, there is one strategy
$\bar{s}$ that is exactly the inverse of $s$:
$\bar{s}(i) = 1 - s(i)$, $i = 1, \ldots, 2^M$,
but one can also consider all strategies which are at a distance
of 1/2 from $s$, so to say uncorrelated with $s$. Note that if
$s$ and $t$ are uncorrelated, $\bar{s}$ is also uncorrelated with
$t$. In appendix A, we show that an ensemble all of whose elements
are mutually uncorrelated contains at most $2^M$ elements, and
give a method to build such an ensemble given a strategy.
Let $s$ be a strategy and $U_M$ such an ensemble that contains $s$.
Then let $\bar{U}_M$ be such that for all $t$ in $U_M$, $\bar{t}$
belongs to $\bar{U}_M$. All elements of $U_M$, except $s$ itself,
are uncorrelated with $s$, as are all elements of $\bar{U}_M$,
except $\bar{s}$. So, given a strategy $s$, there
are $2 \cdot 2^M - 2$ strategies uncorrelated with $s$, and, of
course, one totally anti-correlated. This property clarifies the
number of strategies which are really different: it is the cardinal
of $V_M = U_M \cup \bar{U}_M$, namely $2 \cdot 2^M$. The ensemble
$V_M$ can be called the reduced strategy space. Note that given a
strategy $s$ there are $2^{2^M - 1}$ ensembles $U_M$ that contain
$s$, but that $V_M$ is unique in the sense that if $t \in V_M$ and
$t \in V'_M$, then $V_M = V'_M$. Of course, a given strategy always
belongs to such an ensemble, related to $V_M$ by a given number of
reflections. In other words $V_M$ defines a geometrical structure
on the hypercube $H_M$.

It is clear now why the quantity $\rho = 2^M/N$ is a fundamental
parameter: it is proportional to the cardinal of $V_M$ over the
number of drawn strategies, i.e. to the inverse of the density of
the system in the reduced strategy space. Therefore, the right
parameter should be $2 \cdot 2^M/(SN)$.

The fundamental role of $V_M$ is shown by the following
experiment. Since it is easy to construct $V_M$, we can modify our
model by forcing the players to draw their strategies from a
given $V_M$. We plot $\sigma^2/N$ versus $\rho$ (see figure 10).
One sees that $\sigma^2/N$ is really very close to the original
one. What do we learn from this comparison? First, in the crowded
phase ($\rho \ll \rho_c$), we saw that $\sigma^2/N^2 \propto 1/2^M$.
If $M$ is constant, a system that differs from another one only by
$N$ is just a dilatation of the latter. When all strategies belong
to $V_M$, $SN/(2 \cdot 2^M)$ is the average number of times each
strategy has been drawn, thus the cause of the dilatation is
obvious.



are $\langle l\rangle$ pairs of anti-correlated players, only
$N - 2\langle l\rangle$ players are still uncorrelated, and

[Equations (3) and (4) are illegible in the source; only the
fragment $(N - 1/2)/(2 \cdot 2^M)$ survives.]

Note that $\sigma_h^2/N$ tends to 1/4 like $N/2^M$. Since
$\langle l\rangle$ is the average maximal number of anti-correlated
strategies, $\sigma_h^2/N$ is a kind of lower bound for
$\sigma^2/N$, as it appears in figure 11, in which one can see that
this approximation holds for $N < 2^M$. One can also find
analytically that $\sigma^2/N$ depends linearly on
$\langle d\rangle$, at least in this region. Indeed, using equation
(?), the fact that
FIG. 10. $\sigma^2/N$ versus $\rho$ obtained by drawing all the
strategies from $V_M$ ($M = 8$, $S = 2$). The continuous line shows
the original $\sigma^2/N$, and the dashed line indicates the
performance of the random case. The two curves are very close
together, which confirms the fundamental role of $V_M$.

Let us now discuss the better-than-random phase. It is
analytically tractable in the $\rho \gg 1$ limit, since two
strategies are either the same, or uncorrelated, or anti-correlated,
which implies that two players are not bound together unless
they have identical or anti-correlated strategies. Suppose
that $S = 2$; if $N \ll 2^M$, it is easy to compute the average
maximal number of anti-correlated strategies, $\langle l\rangle$.
Under the hypothesis that a strategy is drawn at most once and that
a player cannot possess two anti-correlated strategies, the joint
probability that the system can have $l$ pairs of anti-correlated
strategies and that $n$ drawn strategies belong to $U_M$ is given
by



[Equation (1), the expression of $P(l \cup n)$, is illegible in the
source.]

where $n$, respectively $\bar{n}$, is the number of strategies
belonging to $U_M$, respectively $\bar{U}_M$ (of course
$n + \bar{n} = 2N$), and $\theta(x)$ is the Heaviside function.
Thus the average number of anti-correlated strategies in the system
is equal to



and dropping the terms of order $1/N^2$ yield the equation

[Equation (6), a linear relation between $\sigma^2/N$ and
$\langle d\rangle - 1/2$ involving a constant $C$, is illegible in
the source.]

where $C$ is a constant equal to 1 in this case, but depends
at least on $M$ and $N$ in the case of the original model; for
instance, $C \simeq 0.0302$ in figure 9. Equation (6) indicates that
$\sigma^2/N = 1/4$ if $\langle d\rangle = 1/2$ and that
$\sigma^2/N < 1/4$ as soon as $\langle d\rangle > 1/2$
(see figures 8 and 9). One might find analytical relationships
in the other regions by allowing a strategy to be drawn several
times.
FIG. 11. Theoretical (continuous line) and original
$1/4 - \sigma^2/N$ versus $\rho = 2^M/N$ for $M = 8$ (squares),
$M = 10$ (stars) and $M = 12$ (circles) ($S = 2$). The fluctuations
are quite important due to the small number of players. One sees
that the approximation is good if $N < 2^M$.

Obtaining $\sigma^2/N$ is now straightforward. If the system
consists of $N$ uncorrelated players, $\sigma^2/N = 1/4$. Now, if
there

IV. INFLUENCE OF IDENTICAL PLAYERS

In this section, we study the effect of cloning a player
several times on the system and on the player. One starts with
$n$ players having exactly the same strategies and all virtual
values set to zero. Figure 12 shows the average gain of such
players against their relative number in the system, $n/N$;
note that one keeps $N$ constant. One sees that their gain
is well fitted by a quadratic function and is equal to zero at
$n_m \simeq 20$.
FIG. 12. Average gain of $n$ identical players against $n/N$
(circles), and $\langle g\rangle - B(n)$ with $w_m = 38$ (dashed
line) and $w_m = 30$ (dotted line) ($N = 101$, $M = 10$, $S = 2$,
NIT = 5000).

Calculating the influence of $n$ on the identical players is
easy: suppose that at a given time step the cloned player
wins if $n = 0$; let $w$ be the number of winners; if
$w + n < N/2$, nothing changes for them. But if $w + n > N/2$, the
identical players and those who made the same choice lose. That
means in particular that the cloned player wins less than before.
One can estimate the loss $B$ for the cloned player:

$$B(n) = \sum_{w=0}^{(N-1)/2} P(w)\,\langle g\rangle\,
\theta(w + n - N/2)
= \langle g\rangle \sum_{w=(N+1)/2-n}^{(N-1)/2} P(w), \qquad (7)$$

where $\langle g\rangle$ is the average gain of the cloned player
when $n = 0$ and $P(w)$ is the probability that the system gets $w$
winners. Of course one does not know $P(w)$, but figure 13 shows
that one can approximate $P(w)$ well by a linear function,

$$P(w) = \begin{cases} 0 & \text{if } w < w_m \\
aw + b & \text{if } w_m < w < N/2 \end{cases} \qquad (8)$$

where $a$ and $b$ are such that $P(w_m) = 0$ and $P$ is normalized,
that is, $b = -a w_m$ and

$$a = \frac{2}{N^2/4 - w_m(N - w_m)}. \qquad (9)$$

[Equation (10), an approximate expression of $B(n)$ proportional to
$\langle g\rangle$, is illegible in the source.]

We eventually define

$$\langle g\rangle(n) = \langle g\rangle\,(1 - B(n)). \qquad (11)$$



Before one can plot $\langle g\rangle(n)$, one must find the
numerical value of $w_m$. One has two choices. The first one is the
mean-field way: one supposes that the behavior of the remaining
non-cloned players does not differ from the $n = 0$ case, and
one tries to match the shape of $P_{n_A}$ by taking $w_m = 38$; the
resulting $\langle g\rangle(n)$ is the dashed line in figure 12 and
appears to be too pessimistic. The second one is to consider that
the adaptive way in which all the players, including the
identical ones, use their strategies allows one to take $w_m$ such
that $P_{n_A}(x < w_m) = 0$, that is, $w_m = 30$; the result is
the dotted line in the same figure and is very close to the
experimental data.
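
The estimate of the loss can be evaluated numerically with a short
sketch. It follows the sum of equation (7) and the linear shape of
equation (8), but, as an assumption of this sketch, $P(w)$ is
normalized by direct summation over the integers
$0 \le w \le (N-1)/2$ instead of the closed form for $a$; the function
names are illustrative, and the gain of the clones is taken as
$\langle g\rangle - B(n)$, as in the caption of figure 12.

```python
def linear_winner_distribution(n_players, w_min):
    """Discrete P(w) proportional to (w - w_min) for w >= w_min, zero below,
    normalized over 0 <= w <= (N - 1) / 2 (N odd)."""
    w_max = (n_players - 1) // 2
    weights = {w: max(w - w_min, 0) for w in range(w_max + 1)}
    total = sum(weights.values())
    return {w: weight / total for w, weight in weights.items()}


def clone_loss(n_clones, n_players, w_min, base_gain):
    """B(n): base gain times the probability that w + n exceeds N/2."""
    prob = linear_winner_distribution(n_players, w_min)
    w_max = (n_players - 1) // 2
    first = max((n_players + 1) // 2 - n_clones, 0)
    return base_gain * sum(prob[w] for w in range(first, w_max + 1))


def clone_gain(n_clones, n_players, w_min, base_gain):
    """Estimated average gain of the identical players."""
    return base_gain - clone_loss(n_clones, n_players, w_min, base_gain)
```

With $N = 101$ and $w_m = 30$, the estimated gain vanishes once
$n \ge 21$, whatever the base gain, which is close to the value
$n_m \simeq 20$ read from figure 12.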
FIG. 13. Probability distribution of attendance at side A,
$P_{n_A}$ (same parameters as in figure 12).
If one plots $\sigma^2/N$ against $n/N$ (figure 14), one sees that
$\sigma^2/N$ remains roughly constant when $n/N$ is small, then
grows. If $n/N > w_m$, $\sigma^2/N$ grows like $(n/N)^2$; indeed,
since the identical players always lose, the number of losers
grows linearly with $n$. On the other hand, figure 15 shows
the average gain of the non-cloned players. One notes that
they take advantage of the situation as soon as $n > 0$ and
that the average gain grows beyond 1/2 and stays roughly
constant as soon as the average gain of the identical players
falls to zero; it is a consequence of the adaptive use of the
strategies.

FIG. 14. $\sigma^2/N$ against $n/N$. The right part grows like
$(n/N)^2$.

FIG. 15. Average gain of the non-cloned players.

Note that the studied system was such that $\rho > \rho_c$. If
$\rho < \rho_c$, the loss $B(n)$ is still an increasing function
of $n$, but approximating $P(w)$ is more difficult since the
histogram of the attendance at side A has three peaks, and
accordingly $B(n)$ is more complicated. One sees in figure 16 that
$\sigma^2/N$ has a minimum, because the actual number of different
players is $N - n$; consequently $\rho$ increases when $n$ grows.

FIG. 16. $\sigma^2/N$ against $n/N$. Note that $\sigma^2/N$ has a
minimum.



V. EFFICIENCY 

One interesting feature of our model is the following: suppose
that $\rho < \rho_c$, then the game is efficient in the sense that,
given a piece of information, the two sides win with equal
probability at the next time step; conversely, if $\rho > \rho_c$,
the game is inefficient. Moreover, suppose that $N_M$ players have
the same memory $M$ and that the game is efficient; then the game
is inefficient for hypothetical players who would have a memory
equal to $M + 1$ [?]. Since the latter can take advantage of the
situation, it is interesting to add $N_{M+1}$ of them to the game
in order to see what they can win and when the inefficiency
disappears for the $M + 1$ memory. In figure 17 one plots the
average gain of the profiteers and of the victims. The latter
grows a little bit and then stays roughly constant; indeed, the
cleverness of the profiteers compensates the bad performance
of the victims. The gain of the profiteers starts from above
1/2, decreases linearly until $N_{M+1} \simeq N_M$, falling below
1/2 at $N_{M+1} = N_M/2$, and then decreases less than linearly, but
it is not clear whether their gain eventually stabilizes above
that of the victims. Therefore, one characterizes the inefficiency
of the system for players with memory $M$ by $\sigma_M$ defined by

$$\sigma_M = \sum_{i=1}^{2^M} \left(P(A|i) - \frac{1}{2}\right)^2,
\qquad (12)$$

where $i$ is a piece of information of length $M$ and $P(A|i)$ is
the conditional probability that the winning side will be A at the
next time step given the information $i$. If $\sigma_M = 0$, the
system is efficient. When one plots $N_{M+1}\sigma_{M+1}$ against
$N_{M+1}$, one sees that $\sigma_{M+1}$ behaves like $1/N_{M+1}$
when $N_{M+1}$ is roughly greater than $N_M$, that is,
$\sigma_{M+1}$ decreases very slowly. That indicates
that the profiteers can always take advantage of the victims.
Finally, figure 19 shows that the average gain of the profiteers
is monotonically related to $\sigma_{M+1}$, since the inefficiency
$\sigma_{M+1}$ is a measure of the opportunistic gains the
profiteers can get.
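
The inefficiency can be estimated from a recorded sequence of winning
sides, as in the sketch below. The encoding (1 when side A wins, 0
otherwise), the function name, the absence of any normalization
prefactor, and the convention that pieces of information never observed
contribute nothing are assumptions of this sketch.

```python
from collections import defaultdict


def inefficiency(winning_sides, memory):
    """Sum over observed informations i of (P(A|i) - 1/2)^2, where P(A|i)
    is the empirical frequency with which A wins right after i."""
    counts = defaultdict(lambda: [0, 0])  # info -> [times seen, A wins next]
    for t in range(memory, len(winning_sides)):
        info = tuple(winning_sides[t - memory:t])
        counts[info][0] += 1
        counts[info][1] += winning_sides[t]
    return sum((wins / seen - 0.5) ** 2 for seen, wins in counts.values())
```

As a toy illustration, the periodic sequence 0, 0, 1, 1, 0, 0, 1, 1, 0
looks efficient to a memory of 1 (after either side, A wins half of the
time) but is fully predictable with a memory of 2.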
FIG. 17. Average gain of the profiteers (circles) and of the
victims (squares) against the number of profiteers ($N_M = 101$,
$N_{M+1} = 2, \ldots, 300$, $M = 3$, $S = 2$).

FIG. 19. Monotonic relationship between the average
gain of the profiteers and $\sigma_{M+1}$.



VI. EXTENSIONS OF THE MODEL 




FIG. 18. $N_{M+1}\sigma_{M+1}$ against $N_{M+1}$, showing that
$\sigma_{M+1} \sim 1/N_{M+1}$ when $N_{M+1} > N_M$.



Although simple, our model is very rich and allows a
lot of variations. For instance, the minority game seems to be
a special case of the bar problem, but one should not think that
the binary nature of our game prevents the system from learning
a threshold different from $N/2$. Indeed, one just has to modify
the payoff by breaking the symmetry between the sides A and
B: the players at side A win a point if they are fewer than $aN$
($0 < a < 1$); conversely, if there are more than $aN$ players
at side A, those who choose side B win. The results are
roughly the same as before, except that the game is no longer
symmetrical. On the other hand, one could argue that our
model would be more realistic if the oldest virtual gains of
the strategies were not kept in memory. One can modify the
model in such a way that only the $T$ most recent virtual gains
are taken into account to determine which strategy a player
uses. This modification does not change the properties a lot,
in particular figure [?], if $T$ is not too small with respect to
$M$; it lets the model be Markovian of order $T$ and might help
to solve the model exactly. One can also imagine weighting
the virtual gains, that is, multiplying the reward gained $t$
time steps ago by $c^t$, where $0 < c < 1$, in order to include
progressive forgetting; one might find a critical value of $c$, as
in the prisoner's dilemma [?].
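
Both memory variants amount to a small change in how the virtual gain
of a strategy is accumulated. The following sketch is one possible
way to write them; the names `discounted_update` and `WindowedGain` are
hypothetical, and the convention that the most recent reward has weight
$c^0 = 1$ is an assumption.

```python
from collections import deque


def discounted_update(virtual_gain, reward, c):
    """Return the new virtual gain when a reward obtained t steps ago is
    weighted by c**t (the latest reward has weight 1)."""
    return c * virtual_gain + reward


class WindowedGain:
    """Virtual gain that only keeps the T most recent rewards."""

    def __init__(self, window):
        # a deque with maxlen drops the oldest reward automatically
        self.rewards = deque(maxlen=window)

    def add(self, reward):
        self.rewards.append(reward)

    def value(self):
        return sum(self.rewards)
```

For instance, feeding the rewards 1, 0, 1 with $c = 0.5$ gives
$1 \cdot 0.25 + 0 \cdot 0.5 + 1 = 1.25$.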



VII. CONCLUSION 



We have reviewed some interesting properties of our model;
in particular, we have given a geometrical explanation of the
phase transition, which has allowed us to find an analytical
expression of $\sigma^2/N$ in the $\rho \gg 1$ region. Nevertheless,
an analytical solution of the whole model is still missing, being
rather hard to find due to the adaptive use of the strategies.
Consequently, one could be interested in another approach
proposed by one of us [?]: one gives only one binary strategy
to each player; there are active and passive players; the active
ones play the minority game, the passive ones observe, waiting
for the best moment to play. This model might reproduce
some of the properties of ours and permit an easier analytical
approach.

This work has been supported in part by the Swiss National 
Foundation through the Grant No. 20-46918.96. 



APPENDIX A: 

In this appendix we show that an ensemble of strategies
all of whose elements are mutually uncorrelated contains at most
$2^M$ elements. The proof also gives a method to build such an
ensemble given a strategy $s \in H_M$. Without loss of generality,
we consider $s(i) = 0$, $i = 1, \ldots, 2^M$. We denote by $U_M$ a
maximal subset of $H_M$ such that all its elements are mutually
uncorrelated and that it contains $s$; note that $U_M$ is not
unique (see above). It is easy to build $U_M$ from $U_{M-1}$. If
$a, b \in H_{M-1}$, one defines the direct product $a \otimes b$ by

$$(a \otimes b)(i) = \begin{cases}
a(i) & \text{if } 1 \le i \le 2^{M-1} \\
b(i - 2^{M-1}) & \text{if } 2^{M-1} < i \le 2^M
\end{cases}$$

This product actually extends $a \in H_{M-1}$ into $H_M$ by
appending the components of $b$. The ensemble $U_M$ is simply
obtained from $U_{M-1}$ by taking all $a \in U_{M-1}$, and putting
$a \otimes a$ and $a \otimes \bar{a}$ in $U_M$. It is easy to see
that all elements of $U_M$ are mutually uncorrelated. Since
$U_0 = \{(0)\}$, $U_M$ contains $2^M$ elements, and one proves that
its size is maximal by induction:

(i) clearly, $U_1 = \{(0, 0); (0, 1)\}$ is as large as possible.

(ii) suppose that the size of $U_{M-1}$ is maximal.

(iii) let $h \in H_M$; suppose that for all $u \in U_M$
($u \ne h$), $d_M(h, u) = 1/2$. Consider now one $a \in U_M$,
$a \ne h$; one can decompose $h$ into $h_1 \otimes h_2$, where
$h_1, h_2 \in H_{M-1}$, and $a$ into $b \otimes c$, where
$b, c \in U_{M-1}$; remember that $c = b$ or $c = \bar{b}$. Since
$b \otimes c$ and $b \otimes \bar{c} \in U_M$,

$$d_M(h, a) = \frac{1}{2}\left(d_{M-1}(h_1, b) + d_{M-1}(h_2, c)\right)
= \frac{1}{2}\left(d_{M-1}(h_1, b) + d_{M-1}(h_2, \bar{c})\right)
= \frac{1}{2}\left(d_{M-1}(h_1, b) + 1 - d_{M-1}(h_2, c)\right) = 1/2
\qquad (A1)$$

Thus $d_{M-1}(h_2, c) = 1/2$, then $h_2 \in U_{M-1}$, as does $h_1$,
that is, $h \in U_M$. In other words, $U_M$ contains at most $2^M$
elements.
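
The recursive construction used in the proof can be checked numerically
with a short sketch. Strategies are represented as tuples of bits; the
function names are illustrative only, and the construction starts from
the all-zero strategy, as in the appendix.

```python
from itertools import combinations


def complement(s):
    """Bitwise inverse of a strategy."""
    return tuple(1 - bit for bit in s)


def build_uncorrelated_set(memory):
    """U_M built from U_0 = {(0,)} by putting a+a and a+complement(a)
    in U_M for every a in U_{M-1} (+ is concatenation)."""
    current = [(0,)]
    for _ in range(memory):
        nxt = []
        for a in current:
            nxt.append(a + a)
            nxt.append(a + complement(a))
        current = nxt
    return current


def reduced_strategy_space(memory):
    """V_M: the elements of U_M together with their inverses."""
    u = build_uncorrelated_set(memory)
    return u + [complement(s) for s in u]


def is_mutually_uncorrelated(strategies):
    """True if every pair differs on exactly half of its components."""
    return all(2 * sum(x != y for x, y in zip(s, t)) == len(s)
               for s, t in combinations(strategies, 2))
```

For $M = 3$ this yields 8 mutually uncorrelated strategies of 8 bits
each, and a reduced strategy space of 16 distinct strategies.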